---
title: Agent Loops
description: Supported computer-using agent loops and models
---

<Callout>
  A corresponding{' '}
  <a href="https://github.com/trycua/cua/blob/main/notebooks/agent_nb.ipynb" target="_blank">
    Jupyter Notebook
  </a>{' '}
  is available for this documentation.
</Callout>

An agent is essentially a loop that generates actions, executes them, and repeats until the task is complete:

1. **Generate**: Your `model` generates `output_text`, `computer_call`, and `function_call` items
2. **Execute**: The `computer` safely executes the calls
3. **Complete**: If the model has no more calls, it's done!
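
The three steps above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `run_loop`, `fake_generate`, and `fake_execute` are hypothetical names, and the model and computer are replaced by simple stand-in functions.

```python
from typing import Any, Callable, Dict, List

Item = Dict[str, Any]


def run_loop(
    generate: Callable[[List[Item]], List[Item]],
    execute: Callable[[Item], Item],
    messages: List[Item],
    max_steps: int = 10,
) -> List[Item]:
    """Generate items, execute any calls, and repeat until no calls remain."""
    history = list(messages)
    for _ in range(max_steps):
        # Generate: the model returns new output items.
        items = generate(history)
        history.extend(items)

        # Complete: no computer or function calls means the task is done.
        calls = [i for i in items if i["type"] in ("computer_call", "function_call")]
        if not calls:
            break

        # Execute: run each call and append its output to the history.
        for call in calls:
            history.append(execute(call))
    return history


def fake_generate(history: List[Item]) -> List[Item]:
    """Stand-in model: requests one screenshot, then replies with text."""
    if any(i.get("type") == "computer_call_output" for i in history):
        return [{"type": "message", "role": "assistant",
                 "content": [{"type": "output_text", "text": "Done"}]}]
    return [{"type": "computer_call", "call_id": "call_1",
             "action": {"type": "screenshot"}}]


def fake_execute(call: Item) -> Item:
    """Stand-in computer: returns a placeholder output for every call."""
    return {"type": "computer_call_output", "call_id": call["call_id"],
            "output": {"image_url": "data:image/png;base64,..."}}


history = run_loop(fake_generate, fake_execute,
                   [{"role": "user", "content": "Take a screenshot"}])
print([item.get("type", item.get("role")) for item in history])
```

The `max_steps` cap is an assumption added for safety in the sketch, so that a model that never stops issuing calls cannot loop forever.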

The following example runs an agent loop against a cloud sandbox:

```python
from agent import ComputerAgent
import asyncio
from computer import Computer


async def take_screenshot():
    async with Computer(
        os_type="linux",
        provider_type="cloud",
        name="your-sandbox-name",
        api_key="your-api-key"
    ) as computer:

        agent = ComputerAgent(
            model="anthropic/claude-3-5-sonnet-20241022",
            tools=[computer],
            max_trajectory_budget=5.0
        )

        messages = [{"role": "user", "content": "Take a screenshot and tell me what you see"}]

        async for result in agent.run(messages):
            for item in result["output"]:
                if item["type"] == "message":
                    print(item["content"][0]["text"])


if __name__ == "__main__":
    asyncio.run(take_screenshot())
```

For a list of supported models and configurations, see the [Supported Agents](./supported-agents/computer-use-agents) page.

### Response Format

Each result yielded by `agent.run` is a dictionary with an `output` list of items and a `usage` summary:

```python
{
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "I can see..."}]
        },
        {
            "type": "computer_call",
            "action": {"type": "screenshot"},
            "call_id": "call_123"
        },
        {
            "type": "computer_call_output",
            "call_id": "call_123",
            "output": {"image_url": "data:image/png;base64,..."}
        }
    ],
    "usage": {
        "prompt_tokens": 150,
        "completion_tokens": 75,
        "total_tokens": 225,
        "response_cost": 0.01,
    }
}
```
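
A result in this shape can be reduced to the parts most callers need. The helper below is a sketch under the assumption that every result follows the structure shown above; `summarize_result` is a hypothetical name, not part of the library.

```python
from typing import Any, Dict, List


def summarize_result(result: Dict[str, Any]) -> Dict[str, Any]:
    """Extract assistant text, requested actions, and usage from one result."""
    texts: List[str] = []
    actions: List[str] = []
    for item in result.get("output", []):
        if item["type"] == "message":
            # A message holds a list of content parts; keep the text parts.
            texts.extend(
                part["text"]
                for part in item["content"]
                if part.get("type") == "output_text"
            )
        elif item["type"] == "computer_call":
            actions.append(item["action"]["type"])
    usage = result.get("usage", {})
    return {
        "texts": texts,
        "actions": actions,
        "tokens": usage.get("total_tokens", 0),
        "cost": usage.get("response_cost", 0.0),
    }


example = {
    "output": [
        {"type": "message", "role": "assistant",
         "content": [{"type": "output_text", "text": "I can see..."}]},
        {"type": "computer_call", "action": {"type": "screenshot"},
         "call_id": "call_123"},
        {"type": "computer_call_output", "call_id": "call_123",
         "output": {"image_url": "data:image/png;base64,..."}},
    ],
    "usage": {"prompt_tokens": 150, "completion_tokens": 75,
              "total_tokens": 225, "response_cost": 0.01},
}

summary = summarize_result(example)
print(summary)
```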

### Environment Variables

Use the following environment variables to configure the agent and its access to cloud computers and LLM providers:

```bash
# Computer instance (cloud)
export CUA_SANDBOX_NAME="your-sandbox-name"
export CUA_API_KEY="your-cua-api-key"

# LLM API keys
export ANTHROPIC_API_KEY="your-anthropic-key"
export OPENAI_API_KEY="your-openai-key"
```
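
If you prefer to pass the sandbox settings explicitly, they can be read from the environment first. This is a sketch: `computer_kwargs_from_env` is a hypothetical helper, and the `os_type` and `provider_type` values are taken from the example at the top of this page rather than from the environment.

```python
import os
from typing import Dict, Mapping


def computer_kwargs_from_env(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build keyword arguments for `Computer` from environment variables."""
    required = ("CUA_SANDBOX_NAME", "CUA_API_KEY")
    missing = [key for key in required if not env.get(key)]
    if missing:
        # Fail early with a clear message instead of a later connection error.
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "os_type": "linux",
        "provider_type": "cloud",
        "name": env["CUA_SANDBOX_NAME"],
        "api_key": env["CUA_API_KEY"],
    }


# Demonstration with an explicit mapping instead of the real environment.
kwargs = computer_kwargs_from_env(
    {"CUA_SANDBOX_NAME": "your-sandbox-name", "CUA_API_KEY": "your-cua-api-key"}
)
print(kwargs["name"])
```

The result can then be unpacked into the constructor, for example `Computer(**computer_kwargs_from_env())`.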

### Input and Output

The input prompt passed to `ComputerAgent.run` can be either a string or a list of message dictionaries:

```python
messages = [
    {
        "role": "user",
        "content": "Take a screenshot and describe what you see"
    },
    {
        "role": "assistant",
        "content": "I'll take a screenshot for you."
    }
]
```

The return value is an `AsyncGenerator`; iterate over it with `async for` to receive response chunks in the format described under Response Format.
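
The two input forms and the async-generator output can be illustrated without a real agent. In this sketch, `to_messages`, `fake_run`, and `collect_output` are hypothetical names; `fake_run` only stands in for `agent.run`, and the way the library normalizes a string prompt internally may differ.

```python
import asyncio
from typing import Any, AsyncGenerator, Dict, List, Union

Message = Dict[str, Any]


def to_messages(prompt: Union[str, List[Message]]) -> List[Message]:
    """Wrap a plain string as a single user message; pass lists through."""
    if isinstance(prompt, str):
        return [{"role": "user", "content": prompt}]
    return list(prompt)


async def fake_run(prompt: Union[str, List[Message]]) -> AsyncGenerator[Dict[str, Any], None]:
    """Hypothetical stand-in for `agent.run`: yields two response chunks."""
    messages = to_messages(prompt)
    yield {"output": [{"type": "computer_call", "call_id": "call_1",
                       "action": {"type": "screenshot"}}]}
    yield {"output": [{"type": "message", "role": "assistant",
                       "content": [{"type": "output_text",
                                    "text": f"Received {len(messages)} message(s)"}]}]}


async def collect_output(chunks: AsyncGenerator[Dict[str, Any], None]) -> List[Dict[str, Any]]:
    """Drain an async generator and gather every output item in order."""
    items: List[Dict[str, Any]] = []
    async for chunk in chunks:
        items.extend(chunk["output"])
    return items


items = asyncio.run(collect_output(fake_run("Take a screenshot")))
print([item["type"] for item in items])
```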

### Parameters

The `ComputerAgent` constructor provides a wide range of options for customizing agent behavior, tool integration, callbacks, resource management, and more.

- `model` (`str`): **Required**
  The LLM or agent model to use. Determines which agent loop is selected unless `custom_loop` is provided (e.g., `"claude-3-5-sonnet-20241022"`, `"computer-use-preview"`, `"omni+vertex_ai/gemini-pro"`).
- `tools` (`List[Any]`):
  List of tools the agent can use (e.g., `Computer`, sandboxed Python functions, etc.).
- `custom_loop` (`Callable`):
  Optional custom agent loop function. If provided, overrides automatic loop selection.
- `only_n_most_recent_images` (`int`):
  If set, only the N most recent images are kept in the message history. Useful for limiting memory usage. Automatically adds `ImageRetentionCallback`.
- `callbacks` (`List[Any]`):
  List of callback instances for advanced preprocessing, postprocessing, logging, or custom hooks. See [Callbacks & Extensibility](#callbacks--extensibility).
- `verbosity` (`int`):
  Logging level (e.g., `logging.INFO`). If set, adds a logging callback.
- `trajectory_dir` (`str`):
  Directory path to save full trajectory data, including screenshots and responses. Adds `TrajectorySaverCallback`.
- `max_retries` (`int`): Default: `3`
  Maximum number of retries for failed API calls.
- `screenshot_delay` (`float` | `int`): Default: `0.5`
  Delay, in seconds, before taking screenshots.
- `use_prompt_caching` (`bool`): Default: `False`
  Enables prompt caching for repeated prompts (mainly for Anthropic models).
- `max_trajectory_budget` (`float` | `dict`):
  If set, adds a budget manager callback that tracks usage costs and stops execution when the budget is exceeded. A dict enables advanced options (e.g., `{ "max_budget": 5.0, "raise_error": True }`).
- `instructions` (`str` | `list[str]`):
  System instructions for the agent. Can be a single string or several strings in a tuple or list for readability; they are concatenated into one system prompt.
- `api_key` (`str`):
  Optional API key override for the model provider.
- `api_base` (`str`):
  Optional API base URL override for the model provider.
- `**additional_generation_kwargs` (`any`):
  Any additional keyword arguments are passed through to the agent loop or model provider.
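
To make the effect of `only_n_most_recent_images` concrete, the sketch below drops all but the N most recent screenshot outputs from a list of items. It is an illustration only: `keep_recent_images` is a hypothetical function, and the actual `ImageRetentionCallback` may handle the history differently (for example, by also removing the matching `computer_call` items).

```python
from typing import Any, Dict, List

Item = Dict[str, Any]


def keep_recent_images(items: List[Item], n: int) -> List[Item]:
    """Return `items` with only the `n` most recent image outputs retained."""
    image_indices = [
        index
        for index, item in enumerate(items)
        if item.get("type") == "computer_call_output"
        and "image_url" in item.get("output", {})
    ]
    # Everything except the last `n` image outputs is dropped.
    to_drop = set(image_indices[:-n]) if n > 0 else set(image_indices)
    return [item for index, item in enumerate(items) if index not in to_drop]


conversation: List[Item] = [{"role": "user", "content": "Open the browser"}]
for step in range(1, 5):
    conversation.append({"type": "computer_call", "call_id": f"call_{step}",
                         "action": {"type": "screenshot"}})
    conversation.append({"type": "computer_call_output", "call_id": f"call_{step}",
                         "output": {"image_url": "data:image/png;base64,..."}})

trimmed = keep_recent_images(conversation, n=2)
kept_ids = [item["call_id"] for item in trimmed
            if item.get("type") == "computer_call_output"]
print(kept_ids)
```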

**Example with advanced options:**

```python
import logging

from agent import ComputerAgent
from agent.callbacks import ImageRetentionCallback
from computer import Computer

agent = ComputerAgent(
    model="anthropic/claude-3-5-sonnet-20241022",
    tools=[Computer(...)],
    only_n_most_recent_images=3,
    # Shown to illustrate `callbacks`; `only_n_most_recent_images` above
    # already adds an ImageRetentionCallback automatically.
    callbacks=[ImageRetentionCallback(only_n_most_recent_images=3)],
    verbosity=logging.INFO,
    trajectory_dir="trajectories",
    max_retries=5,
    screenshot_delay=1.0,
    use_prompt_caching=True,
    max_trajectory_budget={"max_budget": 5.0, "raise_error": True},
    # Adjacent string literals are joined as-is, so each one ends with a space
    # or full stop to keep the sentences separated.
    instructions=(
        "You are a helpful computer-using agent. "
        "Output computer calls until you complete the given task."
    ),
    api_key="your-api-key",
    api_base="https://your-api-base.com/v1",
)
```

### Streaming Responses

Pass `stream=True` to `agent.run` to process output as it is produced:

```python
async for result in agent.run(messages, stream=True):
    # Process streaming chunks
    for item in result["output"]:
        if item["type"] == "message":
            print(item["content"][0]["text"], end="", flush=True)
        elif item["type"] == "computer_call":
            action = item["action"]
            print(f"\n[Action: {action['type']}]")
```

### Error Handling

Wrap the loop in `try`/`except` to handle budget overruns and other failures:

```python
try:
    async for result in agent.run(messages):
        # Process results
        pass
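except BudgetExceededException:
    # The budget set via `max_trajectory_budget` was exceeded.
    # BudgetExceededException must be imported before this block runs.
    print("Budget limit exceeded")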
except Exception as e:
    print(f"Agent error: {e}")
```
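
The budget check behind this error can be pictured as a running total of `response_cost` values. The sketch below is an assumption about the general idea, not the library's budget manager: `SketchBudgetExceeded` and `run_with_budget` are hypothetical names, and the results are plain dictionaries rather than real agent output.

```python
from typing import Any, Dict, Iterable, List


class SketchBudgetExceeded(Exception):
    """Hypothetical stand-in for the library's budget exception."""


def run_with_budget(
    results: Iterable[Dict[str, Any]], max_budget: float
) -> List[Dict[str, Any]]:
    """Consume results, stopping with an error once the cost passes the budget."""
    spent = 0.0
    seen: List[Dict[str, Any]] = []
    for result in results:
        # Each result reports its own cost under usage.response_cost.
        spent += result.get("usage", {}).get("response_cost", 0.0)
        seen.append(result)
        if spent > max_budget:
            raise SketchBudgetExceeded(
                f"Spent ${spent:.2f} of ${max_budget:.2f} budget"
            )
    return seen


steps = [{"output": [], "usage": {"response_cost": 2.0}} for _ in range(3)]

try:
    run_with_budget(steps, max_budget=5.0)
    outcome = "completed"
except SketchBudgetExceeded as exc:
    outcome = f"stopped: {exc}"

print(outcome)
```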
